Granular Ball Twin Support Vector Machine with Universum Data

Ganaie, M. A., Ahire, Vrushank

arXiv.org Artificial Intelligence

Innovative Data Representation with Granular Balls: The GBU-TSVM model employs an innovative approach by representing data instances as granular balls rather than conventional points. This method improves the model's robustness and efficiency, especially in handling noisy and large datasets. By grouping data points into granular balls, the model achieves better computational efficiency, increased noise resistance, and enhanced interpretability, establishing a new standard in data representation.

Enhanced Generalization using Universum Data: The GBU-TSVM incorporates Universum data, which includes samples outside the target classes, to significantly improve generalization capabilities. Universum data enables the classifier to perform better on benchmark datasets, demonstrating the model's ability to utilize additional knowledge for more precise predictions.

Refined Learning with Modified Hinge Loss Function: The model includes an advanced hinge loss function that accounts for the radii of granular balls, leading to a more accurate error measure and learning process. This modification allows for a detailed error assessment, enhancing the model's learning efficiency and decision boundary precision. By addressing the limitations of existing TSVM models, this innovation sets a new benchmark in the field of machine learning classifiers.
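A minimal sketch of the granular-ball idea, assuming a ball is summarized by the mean of its points (the center) and the maximum distance from the center (the radius), and that the hinge loss is tightened by the radius so the whole ball must clear the margin. The helper names and the exact loss form are illustrative assumptions, not the paper's formulation:

```python
import math

def make_ball(points):
    """Summarize a set of 2-D points as (center, radius)."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    r = max(math.dist((cx, cy), p) for p in points)
    return (cx, cy), r

def ball_hinge(w, b, ball, label):
    """Hinge loss on a granular ball: the margin requirement at the
    center is tightened by the radius r (scaled by ||w||), so every
    point inside the ball satisfies the margin when the loss is zero."""
    (cx, cy), r = ball
    margin = label * (w[0] * cx + w[1] * cy + b)
    return max(0.0, 1.0 - margin + r * math.hypot(*w))

points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3)]
ball = make_ball(points)
print(ball_hinge((1.0, 0.0), -0.5, ball, +1))
```

Because the radius enters the loss, a tight ball (low noise) is penalized less than a diffuse one at the same center, which is one way the representation can buy noise resistance.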


DOC3-Deep One Class Classification using Contradictions

Dhar, Sauptik, Torres, Bernardo Gonzalez

arXiv.org Artificial Intelligence

This paper introduces the notion of learning from contradictions (a.k.a. Universum learning) for deep one-class classification problems. We formalize this notion for the widely adopted one-class large-margin loss, and propose the Deep One Class Classification using Contradictions (DOC3) algorithm. We show that learning from contradictions incurs lower generalization error by comparing the Empirical Rademacher Complexity (ERC) of DOC3 against its traditional inductive learning counterpart. Our empirical results demonstrate the efficacy of the DOC3 algorithm, which improves test AUCs by more than 30% on CIFAR-10 and more than 50% on the MV-Tec AD data set compared to its inductive learning counterpart, in many cases improving the state-of-the-art in anomaly detection.
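One way to read the contradiction term: in-class samples should score above a positive margin, while contradiction (Universum) samples should score below the negative margin. The loss below is an illustrative sketch under that reading, with hypothetical names and weighting, not the DOC3 objective itself:

```python
def doc3_style_loss(scores_pos, scores_univ, margin=1.0, cu=0.5):
    """Illustrative one-class large-margin loss with contradictions.
    In-class scores below +margin are penalized (hinge-style), and
    contradiction scores above -margin are penalized symmetrically;
    cu weights the contradiction term."""
    loss_pos = sum(max(0.0, margin - s) for s in scores_pos) / len(scores_pos)
    loss_univ = sum(max(0.0, margin + s) for s in scores_univ) / len(scores_univ)
    return loss_pos + cu * loss_univ

# A contradiction scoring +0.5 contributes to the loss even though it is
# below the in-class margin, because it is not confidently rejected.
print(doc3_style_loss([1.5, 0.8], [-0.2, 0.5]))
```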


Single Class Universum-SVM

Dhar, Sauptik, Cherkassky, Vladimir

arXiv.org Artificial Intelligence

This paper extends the idea of Universum learning [1, 2] to single-class learning problems. We propose a Single Class Universum-SVM setting that incorporates a priori knowledge (in the form of additional data samples) into the single-class estimation problem. These additional data samples, or Universum, belong to the same application domain as the (positive) data samples from the single class of interest, but follow a different distribution. The proposed methodology for the single-class U-SVM is based on the known connection between binary classification and single-class learning formulations [3]. Several empirical comparisons are presented to illustrate the utility of the proposed approach.
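For context on the U-SVM machinery being extended: in binary Universum-SVM, Universum samples are typically handled with an ε-insensitive loss that keeps them near the decision boundary, and that loss decomposes into two hinge terms, one per binary label. A small sketch of this standard loss (the single-class variant in the paper builds on the binary/single-class connection cited as [3], not shown here):

```python
def universum_loss(score, eps=0.1):
    """eps-insensitive Universum loss: zero while |f(x)| <= eps,
    linear beyond, so Universum samples are pushed toward f(x) = 0."""
    return max(0.0, abs(score) - eps)

def universum_loss_as_two_hinges(score, eps=0.1):
    """Equivalent decomposition into two hinge-style terms, i.e. the
    Universum sample entered once with each of the two binary labels."""
    return max(0.0, score - eps) + max(0.0, -score - eps)

print(universum_loss(0.5))   # far from the boundary: penalized
print(universum_loss(0.05))  # inside the eps tube: no penalty
```

The two forms agree for any score (with eps >= 0), which is why Universum samples can be fed to an ordinary hinge-loss solver by duplicating each with both labels.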


Multiclass Universum SVM

Dhar, Sauptik, Cherkassky, Vladimir, Shah, Mohak

arXiv.org Machine Learning

We introduce Universum learning for multiclass problems and propose a novel formulation for the multiclass Universum SVM (MU-SVM). We also propose an analytic span bound for model selection with roughly 2-4x faster computation than standard resampling techniques. We empirically demonstrate the efficacy of the proposed MU-SVM formulation on several real-world datasets, achieving more than 20% improvement in test accuracies compared to multiclass SVM.


Selecting Informative Universum Sample for Semi-Supervised Learning

Chen, Shuo (Tsinghua University) | Zhang, Changshui (Tsinghua University)

AAAI Conferences

The Universum sample, defined as a sample that does not belong to any of the classes the learning task concerns, has been shown to be helpful in both supervised and semi-supervised settings. Previous works treat all Universum samples equally. Our research found that not all Universum samples are helpful, and we propose a method to pick the informative ones, i.e., the in-between Universum samples. We also set up a new semi-supervised framework to incorporate the in-between Universum samples. Empirical experiments show that our method outperforms previous approaches.
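A plausible reading of "in-between" is that a useful Universum sample lies roughly between the two class clusters. The heuristic below ranks Universum samples by how nearly equidistant they are from the two class centroids; this is an illustrative stand-in, not the paper's selection criterion:

```python
import math

def class_mean(points):
    """Centroid of a list of equal-dimension tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def select_in_between(universum, pos, neg, k=2):
    """Rank Universum samples by |d(u, mean_pos) - d(u, mean_neg)|;
    a small gap suggests the sample lies between the two classes
    rather than off to one side or far from both."""
    mp, mn = class_mean(pos), class_mean(neg)
    scored = sorted(universum,
                    key=lambda u: abs(math.dist(u, mp) - math.dist(u, mn)))
    return scored[:k]

pos = [(0.0, 0.0), (0.2, 0.1)]
neg = [(2.0, 2.0), (1.8, 2.1)]
univ = [(1.0, 1.0), (3.0, 3.0), (0.9, 1.2)]
print(select_in_between(univ, pos, neg, k=2))
```

Here (3.0, 3.0) is filtered out: it is far beyond the negative class, so by this heuristic it carries little information about the region between the classes.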